AAAI.2020 - Student Abstract Track

Total: 129

#1 Sample Complexity Bounds for RNNs with Application to Combinatorial Graph Problems (Student Abstract)

Authors: Nil-Jana Akpinar ; Bernhard Kratzwald ; Stefan Feuerriegel

Learning to predict solutions to real-valued combinatorial graph problems promises efficient approximations. As demonstrated based on the NP-hard edge clique cover number, recurrent neural networks (RNNs) are particularly suited for this task and can even outperform state-of-the-art heuristics. However, the theoretical framework for estimating real-valued RNNs is understood only poorly. As our primary contribution, this is the first work that upper bounds the sample complexity for learning real-valued RNNs. While such derivations have been made earlier for feed-forward and convolutional neural networks, our work presents the first such attempt for recurrent neural networks. Given a single-layer RNN with $a$ rectified linear units and input of length $b$, we show that a population prediction error of $\varepsilon$ can be realized with at most $\tilde{O}(a^4 b/\varepsilon^2)$ samples. We further derive comparable results for multi-layer RNNs. Accordingly, a size-adaptive RNN fed with graphs of at most $n$ vertices can be learned in $\tilde{O}(n^6/\varepsilon^2)$, i.e., with only a polynomial number of samples. For combinatorial graph problems, this provides a theoretical foundation that renders RNNs competitive.

#2 LatRec: Recognizing Goals in Latent Space (Student Abstract)

Authors: Leonardo Amado ; Felipe Meneguzzi

Recent approaches to goal recognition have progressively relaxed the requirements about the amount of domain knowledge and available observations, yielding accurate and efficient algorithms. These approaches, however, assume that there is a domain expert capable of building complete and correct domain knowledge to successfully recognize an agent's goal. This is too strong for most real-world applications. We overcome these limitations by combining goal recognition techniques from automated planning with deep autoencoders that carry out unsupervised learning, generating domain theories from data streams and using the resulting domain theories to deal with incomplete and noisy observations. Moving forward, we aim to develop a new data-driven goal recognition technique that infers the domain model using the same set of observations used in recognition itself.

#3 An Iterative Approach for Identifying Complaint Based Tweets in Social Media Platforms (Student Abstract)

Authors: Gyanesh Anand ; Akash Gautam ; Puneet Mathur ; Debanjan Mahata ; Rajiv Ratn Shah ; Ramit Sawhney

Twitter is a social media platform where users express opinions over a variety of issues. Posts offering grievances or complaints can be utilized by private/public organizations to improve their service and promptly gauge a low-cost assessment. In this paper, we propose an iterative methodology which aims to identify complaint-based posts pertaining to the transport domain. We perform comprehensive evaluations and release a novel dataset for research purposes.

#4 Entity Type Enhanced Neural Model for Distantly Supervised Relation Extraction (Student Abstract)

Authors: Long Bai ; Xiaolong Jin ; Chuanzhi Zhuang ; Xueqi Cheng

Distantly Supervised Relation Extraction (DSRE) has been widely studied, since it can automatically extract relations from very large corpora. However, existing DSRE methods only use little semantic information about entities, such as the information of entity type. Thus, in this paper, we propose a method for integrating entity type information into a neural network based DSRE model. It also adopts two attention mechanisms, namely, sentence attention and type attention. The former selects the representative sentences for a sentence bag, while the latter selects appropriate type information for entities. Experimental comparison with existing methods on a benchmark dataset demonstrates its merits.

#5 Analysis of Parliamentary Debate Transcripts Using Community-Based Graphical Approaches (Student Abstract)

Authors: Anjali Bhavan ; Mohit Sharma ; Ramit Sawhney ; Rajiv Ratn Shah

Gauging political sentiments and analyzing stances of elected representatives pose an important challenge today, and one with wide-ranging ramifications. Community-based analysis of parliamentary debate sentiments could pave a way for better insights into the political happenings of a nation and help in keeping the voters informed. Such analysis could be given another dimension by studying the underlying connections and networks in such data. We present a sentiment classification method for UK Parliament debate transcripts, which is a combination of a graphical method based on DeepWalk embeddings and text-based analytical methods. We also present proof for our hypothesis that parliamentarians with similar voting patterns tend to deliver similar speeches. We also provide some further avenues and future work towards the end.

#6 Complex Emotional Intelligence Learning Using Deep Neural Networks (Student Abstract)

Authors: Belainine Billal ; Fatiha Sadat ; Hakim Lounis

Emotion recognition and mining tasks are often limited by the availability of manually annotated data. Several researchers have used emojis and specific hashtags as forms of training and supervision. This research paper proposes a new textual and social corpus, labeled using basic emotions following Plutchik's theory. Building on this corpus, the paper proposes a first study for the representation and interpretation of complex emotional interactions, using deep neural networks.

#7 Improving Semantic Parsing Using Statistical Word Sense Disambiguation (Student Abstract)

Authors: Ritwik Bose ; Siddharth Vashishtha ; James Allen

A semantic parser generates a logical form graph from an utterance, where the edges are semantic roles and the nodes are word senses in an ontology that supports reasoning. The generated representation attempts to capture the full meaning of the utterance. While the process of parsing works to resolve lexical ambiguity, a number of errors in the logical forms arise from incorrectly assigned word senses. This is especially true in logical and rule-based semantic parsers. Although the performance of statistical word sense disambiguation (WSD) methods is superior to the word sense output of a semantic parser, these systems do not produce the rich role structure or a detailed semantic representation of the sentence content. In this work, we use decisions from a statistical WSD system to inform a logical semantic parser and greatly improve semantic type assignments in the resulting logical forms.

#8 Towards an Integrative Educational Recommender for Lifelong Learners (Student Abstract)

Authors: Sahan Bulathwela ; María Pérez-Ortiz ; Emine Yilmaz ; John Shawe-Taylor

One of the most ambitious use cases of computer-assisted learning is to build a recommendation system for lifelong learning. Most recommender algorithms exploit similarities between content and users, overlooking the necessity to leverage sensible learning trajectories for the learner. Lifelong learning thus presents unique challenges, requiring scalable and transparent models that can account for learner knowledge and content novelty simultaneously, while also retaining accurate learner representations for long periods of time. We attempt to build a novel educational recommender that relies on an integrative approach combining multiple drivers of learner engagement. Our first step towards this goal is TrueLearn, which models content novelty and background knowledge of learners and achieves promising performance while retaining a human-interpretable learner model.

#9 Iterative Learning for Reliable Underwater Link Adaptation (Student Abstract)

Authors: Junghun Byun ; Yong-Ho Cho ; Tae-Ho Im ; Hak-Lim Ko ; Kyung-Seop Shin ; Ohyun Jo

This paper describes an iterative learning framework consisting of multi-layer prediction processes for underwater link adaptation. To obtain a dataset in real underwater environments, we implemented OFDM (Orthogonal Frequency Division Multiplexing)-based acoustic communications testbeds for the first time. Actual underwater data measured in the Yellow Sea, South Korea, were used for training the iterative learning model. Remarkably, the iterative learning model achieves up to 25% performance improvement over the conventional benchmark model.

#10 SATNet: Symmetric Adversarial Transfer Network Based on Two-Level Alignment Strategy towards Cross-Domain Sentiment Classification (Student Abstract)

Authors: Yu Cao ; Hua Xu

In recent years, domain adaptation tasks have attracted much attention, especially the task of cross-domain sentiment classification (CDSC). In this paper, we propose a novel domain adaptation method called Symmetric Adversarial Transfer Network (SATNet). Experiments on the Amazon reviews dataset demonstrate the effectiveness of SATNet.

#11 CORAL-DMOEA: Correlation Alignment-Based Information Transfer for Dynamic Multi-Objective Optimization (Student Abstract)

Authors: Li Chen ; Hua Xu

One essential characteristic of dynamic multi-objective optimization problems is that the Pareto-Optimal Front/Set (POF/POS) varies over time. Tracking the time-dependent POF/POS is a challenging problem. Since continuous environments are usually highly correlated, past information is critical for the next optimization process. In this paper, we integrate the CORAL methodology into a dynamic multi-objective evolutionary algorithm, named CORAL-DMOEA. This approach employs CORAL to construct a transfer model which transfers past well-performing solutions to form an initial population for the next optimization process. Experimental results demonstrate that CORAL-DMOEA can effectively improve the quality of solutions and accelerate the evolution process.
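
For readers unfamiliar with correlation alignment, the sketch below shows the generic CORAL transform from the domain adaptation literature: whiten the source samples with their own covariance, then re-color them with the target covariance. It is only an illustration. The abstract does not say how CORAL-DMOEA chooses its source and target samples or how the aligned solutions seed the next population, so the function name `coral_align`, the `reg` regularizer, and the (samples, features) layout are all assumptions.

```python
import numpy as np


def _sym_power(mat: np.ndarray, power: float) -> np.ndarray:
    """Raise a symmetric positive-definite matrix to a real power."""
    vals, vecs = np.linalg.eigh(mat)
    # vecs @ diag(vals**power) @ vecs.T, written with broadcasting
    return (vecs * vals**power) @ vecs.T


def coral_align(source: np.ndarray, target: np.ndarray, reg: float = 1e-6) -> np.ndarray:
    """Map source rows so their covariance approximates the target covariance.

    Both inputs are (n_samples, n_features) arrays; `reg` is a small ridge
    term (an assumption of this sketch) that keeps the covariances invertible.
    """
    dim = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + reg * np.eye(dim)
    cov_t = np.cov(target, rowvar=False) + reg * np.eye(dim)
    # Whitening with cov_s^(-1/2), then re-coloring with cov_t^(1/2)
    transform = _sym_power(cov_s, -0.5) @ _sym_power(cov_t, 0.5)
    return source @ transform
```

In an evolutionary setting, one might pass solutions from a previous environment as `source` and a small sample from the new environment as `target`; that usage is a guess rather than something the abstract states.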

#12 Optimizing the Feature Selection Process for Better Accuracy in Datasets with a Large Number of Features (Student Abstract)

Authors: Xi Chen ; Afsaneh Doryab

Most feature selection methods only perform well on datasets with a relatively small set of features. In the case of large feature sets and a small number of data points, almost none of the existing feature selection methods help in achieving high accuracy. This paper proposes a novel approach that optimizes the feature selection process by using the Frequent Pattern Growth algorithm to find sets of features that appear frequently among the top features selected by the main feature selection methods. Our experimental evaluation on two datasets, one containing a small and one a very large number of features, shows that our approach significantly improves the accuracy on the dataset with a very large number of features.
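
The idea of mining feature sets that recur across selectors can be illustrated with a small sketch. Note the swap: the code below counts itemset support by brute-force enumeration, not with Frequent Pattern Growth, which reaches the same frequent sets far more efficiently on large inputs. The function name, the `min_support` and `max_size` parameters, and the toy feature names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_feature_sets(top_lists, min_support, max_size=2):
    """Return feature subsets that occur in at least `min_support` top lists.

    `top_lists` holds one list of top-ranked feature names per selection
    method. Subsets up to `max_size` features are enumerated exhaustively.
    """
    counts = Counter()
    for features in top_lists:
        unique = sorted(set(features))  # canonical order, no duplicates
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Toy example: three hypothetical selectors, each reporting its top 3 features.
top_lists = [
    ["f1", "f2", "f3"],
    ["f2", "f3", "f7"],
    ["f2", "f3", "f9"],
]
frequent = frequent_feature_sets(top_lists, min_support=3)
```

Here `frequent` contains `("f2",)`, `("f3",)`, and `("f2", "f3")`, each with support 3, since those are the only subsets shared by all three lists.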

#13 RPM-Oriented Query Rewriting Framework for E-commerce Keyword-Based Sponsored Search (Student Abstract)

Authors: Xiuying Chen ; Daorui Xiao ; Shen Gao ; Guojun Liu ; Wei Lin ; Bo Zheng ; Dongyan Zhao ; Rui Yan

Sponsored search optimizes revenue and relevance, which is estimated by Revenue Per Mille (RPM). Existing sponsored search models are all based on traditional statistical models, which have poor RPM performance when queries follow a heavy-tailed distribution. Here, we propose an RPM-oriented Query Rewriting Framework (RQRF) which outputs related bid keywords that can yield high RPM. RQRF embeds both queries and bid keywords as vectors in the same implicit space, converting the rewriting probability between each query and keyword to the distance between the two vectors. For label construction, we propose an RPM-oriented sample construction method, labeling keywords based on whether or not they can lead to high RPM. Extensive experiments are conducted to evaluate the performance of RQRF. On one month of large-scale real-world traffic from an e-commerce sponsored search system, the proposed model significantly outperforms the traditional baseline.

#14 Learning to Model Opponent Learning (Student Abstract)

Authors: Ian Davies ; Zheng Tian ; Jun Wang

Multi-Agent Reinforcement Learning (MARL) considers settings in which a set of coexisting agents interact with one another and their environment. The adaptation and learning of other agents induces non-stationarity in the environment dynamics. This poses a great challenge for value function-based algorithms whose convergence usually relies on the assumption of a stationary environment. Policy search algorithms also struggle in multi-agent settings as the partial observability resulting from an opponent's actions not being known introduces high variance to policy training. Modelling an agent's opponent(s) is often pursued as a means of resolving the issues arising from the coexistence of learning opponents. An opponent model provides an agent with some ability to reason about other agents to aid its own decision making. Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies. Such an approach can reduce the variance of training signals for policy search algorithms. However, in the multi-agent setting, agents have an incentive to continually adapt and learn. This means that the assumptions concerning opponent stationarity are unrealistic. In this work, we develop a novel approach to modelling an opponent's learning dynamics which we term Learning to Model Opponent Learning (LeMOL). We show our structured opponent model is more accurate and stable than naive behaviour cloning baselines. We further show that opponent modelling can improve the performance of algorithmic agents in multi-agent settings.

#15 When Low Resource NLP Meets Unsupervised Language Model: Meta-Pretraining then Meta-Learning for Few-Shot Text Classification (Student Abstract)

Authors: Shumin Deng ; Ningyu Zhang ; Zhanlin Sun ; Jiaoyan Chen ; Huajun Chen

Text classification tends to be difficult when data are deficient or when it is required to adapt to unseen classes. In such challenging scenarios, recent studies have often used meta-learning to simulate the few-shot task, thus negating implicit common linguistic features across tasks. This paper addresses such problems using meta-learning and unsupervised language models. Our approach is based on the insight that having a good generalization from a few examples relies on both a generic model initialization and an effective strategy for adapting this model to newly arising tasks. We show that our approach is not only simple but also produces a state-of-the-art performance on a well-studied sentiment classification dataset. It can thus be further suggested that pretraining could be a promising solution for few-shot learning of many other NLP tasks. The code and the dataset to replicate the experiments are made available at https://github.com/zxlzr/FewShotNLP.

#16 Efficient Spatial-Temporal Rebalancing of Shareable Bikes (Student Abstract)

Authors: Zichao Deng ; Anqi Tu ; Zelei Liu ; Han Yu

Bike sharing systems are now popular worldwide. However, these systems face a problem: rebalancing shareable bikes among different docking stations. To address this challenge, we propose an approach for the spatial-temporal rebalancing of shareable bikes which allows domain experts to optimize the rebalancing operation with their knowledge and preferences without relying on learning by trial-and-error.

#17 Hierarchical Average Reward Policy Gradient Algorithms (Student Abstract)

Authors: Akshay Dharmavaram ; Matthew Riemer ; Shalabh Bhatnagar

Option-critic learning is a general-purpose reinforcement learning (RL) framework that aims to address the issue of long term credit assignment by leveraging temporal abstractions. However, when dealing with extended timescales, discounting future rewards can lead to incorrect credit assignments. In this work, we address this issue by extending the hierarchical option-critic policy gradient theorem for the average reward criterion. Our proposed framework aims to maximize the long-term reward obtained in the steady-state of the Markov chain defined by the agent's policy. Furthermore, we use an ordinary differential equation based approach for our convergence analysis and prove that the parameters of the intra-option policies, termination functions, and value functions, converge to their corresponding optimal values, with probability one. Finally, we illustrate the competitive advantage of learning options, in the average reward setting, on a grid-world environment with sparse rewards.

#18 Multi-Agent Pattern Formation with Deep Reinforcement Learning (Student Abstract)

Authors: Elhadji Amadou Oury Diallo ; Toshiharu Sugawara

We propose a decentralized multi-agent deep reinforcement learning architecture to investigate pattern formation under the local information provided by the agents' sensors. It consists of tasking a large number of homogeneous agents to move to a set of specified goal locations, addressing both the assignment and trajectory planning sub-problems concurrently. We then show that agents trained on random patterns can organize themselves into very complex shapes.

#19 American Sign Language Recognition Using an FMCW Wireless Sensor (Student Abstract)

Authors: Yuanqi Du ; Nguyen Dang ; Riley Wilkerson ; Parth Pathak ; Huzefa Rangwala ; Jana Kosecka

In today's digital world, rapid technological advancements continue to lessen the burden of tasks for individuals. Among these tasks is communication across perceived language barriers. Indeed, increased attention has been drawn to American Sign Language (ASL) recognition in recent years. Camera-based and motion detection-based methods have been researched extensively; however, there remains a divide in communication between ASL users and non-users. Therefore, this research team proposes the use of a novel wireless sensor (Frequency-Modulated Continuous-Wave Radar) to help bridge the gap in communication. In short, this device sends out signals that detect the user's body positioning in space. These signals then reflect off the body and back to the sensor, developing thousands of cloud points per second, indicating where the body is positioned in space. These cloud points can then be examined for movement over multiple consecutive time frames using a cell division algorithm, ultimately showing how the body moves through space as it completes a single gesture or sentence. At the end of the project, 95% accuracy was achieved in one-object prediction as well as 80% accuracy on cross-object prediction with 30% other objects' data introduced on 19 commonly used gestures. There are 30 samples for each gesture per person from three persons.

#20 Search Tree Pruning for Progressive Neural Architecture Search (Student Abstract)

Authors: Deanna Flynn ; P. Michael Furlong ; Brian Coltin

Our neural architecture search algorithm progressively searches a tree of neural network architectures. Child nodes are created by inserting new layers, determined by a transition graph, into a parent network up to a maximum depth, and are pruned when their performance is worse than that of their parent. This increases efficiency but makes the algorithm greedy. Simpler networks are successfully found before more complex ones, and the networks found can achieve benchmark performance similar to other top-performing networks.
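
The search-and-prune loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: architectures are tuples of layer names, new layers are only appended at the end (the abstract allows insertion into the parent network), "maximum depth" is read as a cap on the number of layers, and `evaluate` is a hypothetical stand-in for training and scoring a network.

```python
from collections import deque


def progressive_search(root, transitions, evaluate, max_depth):
    """Breadth-first architecture search that prunes children worse than their parent.

    root        : non-empty tuple of layer names
    transitions : dict mapping a layer name to the layers allowed after it
    evaluate    : callable returning a score (higher is better)
    max_depth   : maximum number of layers in an architecture
    """
    root_score = evaluate(root)
    best, best_score = root, root_score
    queue = deque([(root, root_score)])
    while queue:
        parent, parent_score = queue.popleft()
        if len(parent) >= max_depth:
            continue  # do not expand beyond the depth limit
        for layer in transitions.get(parent[-1], []):
            child = parent + (layer,)
            score = evaluate(child)
            if score < parent_score:
                continue  # pruned: this subtree is never explored
            queue.append((child, score))
            if score > best_score:
                best, best_score = child, score
    return best, best_score


# Toy run with made-up scores in place of real training.
toy_scores = {
    ("conv",): 0.5,
    ("conv", "conv"): 0.6,
    ("conv", "pool"): 0.4,
    ("conv", "conv", "conv"): 0.55,
    ("conv", "conv", "pool"): 0.7,
}
toy_transitions = {"conv": ["conv", "pool"], "pool": ["dense"], "dense": []}
evaluated = []


def toy_evaluate(arch):
    evaluated.append(arch)
    return toy_scores.get(arch, 0.0)


best_arch, best_score = progressive_search(("conv",), toy_transitions, toy_evaluate, max_depth=3)
```

Because breadth-first order visits shallow architectures first, simpler networks are scored before deeper ones. The greediness is visible in the toy run: `("conv", "pool")` scores below its parent, so `("conv", "pool", "dense")` is never evaluated, even if it would have scored well.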

#21 Exploring Abstract Concepts for Image Privacy Prediction in Social Networks (Student Abstract)

Authors: Gabriele Galfré ; Cornelia Caragea

Automatically detecting the private nature of images posted in social networks such as Facebook, Flickr, and Instagram is a long-standing goal considering the pervasiveness of these networks. Several prior works on image privacy prediction showed that object tags from images are highly informative about images' privacy. However, we conjecture that other aspects of images captured by abstract concepts (e.g., religion, sikhism, spirituality) can improve the performance of models that use only the concrete objects from an image (e.g., temple and person). Experimental results on a Flickr dataset show that the abstract concepts and concrete object tags complement each other and yield the best performance when used in combination as features for image privacy prediction.

#22 Predicting Opioid Overdose Crude Rates with Text-Based Twitter Features (Student Abstract)

Authors: Nupoor Gandhi ; Alex Morales ; Sally Man-Pui Chan ; Dolores Albarracin ; ChengXiang Zhai

Drug use reporting is often a bottleneck for modern public health surveillance; social media data provides a real-time signal which allows for tracking and monitoring opioid overdoses. In this work we focus on text-based feature construction for the prediction task of opioid overdose rates at the county level. More specifically, using a Twitter dataset with over 3.4 billion tweets, we explore semantic features, such as topic features, to show that social media could be a good indicator for forecasting opioid overdose crude rates in public health monitoring systems. Specifically, combining topic and TF-IDF features in conjunction with demographic features can predict opioid overdose rates at the county level.

#23 I Am Guessing You Can't Recognize This: Generating Adversarial Images for Object Detection Using Spatial Commonsense (Student Abstract)

Authors: Anurag Garg ; Niket Tandon ; Aparna S. Varde

Can we automatically predict failures of an object detection model on images from a target domain? We characterize errors of a state-of-the-art object detection model on the currently popular smart mobility domain, and find that a large number of errors can be identified using spatial commonsense. We propose a system that automatically identifies a large number of such errors based on commonsense knowledge. Our system does not require any new annotations and can still find object detection errors with high accuracy (more than 80% when measured by humans). This work lays the foundation to answer exciting research questions on domain adaptation, including the ability to automatically create adversarial datasets for a target domain.

#24 VECA: A Method for Detecting Overfitting in Neural Networks (Student Abstract)

Authors: Liangzhu Ge ; Yuexian Hou ; Yaju Jiang ; Shuai Yao ; Chao Yang

Despite their widespread applications, deep neural networks often tend to overfit the training data. Here, we propose a measure called VECA (Variance of Eigenvalues of Covariance matrix of Activation matrix) and demonstrate that VECA is a good predictor of networks' generalization performance during the training process. Experiments performed on fully-connected networks and convolutional neural networks trained on benchmark image datasets show a strong correlation between test loss and VECA, which suggests that we can calculate the VECA to estimate generalization performance without sacrificing training data to be used as a validation set.
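
Taking the acronym literally, a VECA-style quantity can be computed in a few lines. The sketch below is an interpretation of the name only: the abstract does not specify which layer's activations are used, how the activation matrix is laid out, or whether any normalization is applied, so the (samples, units) layout and the function name `veca` are assumptions.

```python
import numpy as np


def veca(activations: np.ndarray) -> float:
    """Variance of the eigenvalues of the covariance matrix of an activation matrix.

    `activations` is assumed to have shape (n_samples, n_units), with one row
    per input example and one column per hidden unit.
    """
    # Covariance between units, shape (n_units, n_units)
    cov = np.atleast_2d(np.cov(activations, rowvar=False))
    # The covariance matrix is symmetric, so eigvalsh returns real eigenvalues
    eigenvalues = np.linalg.eigvalsh(cov)
    return float(np.var(eigenvalues))
```

Under this reading, one would record `veca(...)` on a batch of training activations at each epoch and compare its trend with the test loss, which is the correlation the abstract reports.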

#25 Does Speech Enhancement of Publicly Available Data Help Build Robust Speech Recognition Systems? (Student Abstract)

Authors: Bhavya Ghai ; Buvana Ramanan ; Klaus Mueller

Automatic speech recognition (ASR) systems play a key role in many commercial products, including voice assistants. Typically, they require large amounts of high-quality speech data for training, which gives an undue advantage to large organizations that hold large amounts of private data. We investigated whether speech data obtained from publicly available sources can be further enhanced to train better speech recognition models. We begin with noisy/contaminated speech data, apply speech enhancement to produce a 'cleaned' version, and use both versions to train the ASR model. We have found that using speech enhancement gives a 9.5% better word error rate than training on just the original noisy data and 9% better than training on just the ground-truth 'clean' data. Its performance is also comparable to the ideal-case scenario of training on the noisy data together with its ground-truth 'clean' version.